# Codebook: STYL 10-Year Replication Package

This codebook documents the variable naming conventions, global macros, data transformations, and index construction methodology used in the STYL 10-year replication package.

## Table of Contents

1. [Variable Naming Conventions](#variable-naming-conventions)
2. [Global Macros Reference](#global-macros-reference)
3. [Data Transformations](#data-transformations)
4. [Index Construction Methodology](#index-construction-methodology)

---

## Variable Naming Conventions

### Time Period Suffixes

| Suffix | Meaning | Survey Rounds | Example |
| ----------- | --------- | --------------- | --------- |
| `_b` | Baseline value | Pre-treatment | `income_b` |
| `_e` | Endline value (round-specific) | Any follow-up | `income_e` |
| `_stav` | Short-term average | Rounds 1-2 (2-4 weeks) | `crime_stav` |
| `_ltav` | Long-term average | Rounds 5-6 (12-13 months) | `crime_ltav` |
| `_tyav` | Ten-year average | Rounds 7-8 (~10 years) | `crime_tyav` |

### Transformation Suffixes

| Suffix | Meaning | Description |
| ----------- | --------- | ------------- |
| `_z` | Z-score | Standardized to mean=0, sd=1 |
| `_std` | Standardized | Alternative standardization notation |
| `_resc` | Rescaled | Direction adjusted (sign flipped so higher = better/more) |
| `_p` | Percentile censored | Outliers censored at 99th percentile |
| `_cov` | Covariance weighted | Index weighted by inverse covariance matrix |

### Unit/Measurement Suffixes

| Suffix | Meaning | Example |
| -------- | --------- | --------- |
| `usd` | US dollars | `incomeusd_b` |
| `7d` | 7-day reference period | `cash7d_e` |
| `4w` | 4-week reference period | `cash4w_e` |
| `2w` | 2-week reference period | `cash2w_e` |
| `prev` | Previous period | `hourprev7d` |
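
Because the suffixes are systematic, variables can be selected with wildcards. The following is a minimal sketch on a toy dataset; the variable names are illustrative and are not guaranteed to exist in the package data.

```stata
* Toy dataset with illustrative names that follow the suffix conventions
clear
set obs 5
generate cash7d_b    = _n
generate cash4w_e    = 4 * _n
generate crime_tyav  = 0
generate incomeusd_b = 10 * _n

* List every baseline variable; ds stores the matches in r(varlist)
ds *_b
display "`r(varlist)'"
```

The same pattern (`ds *_tyav`, `ds *_std`, and so on) works for any suffix in the tables above.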

### Phase Indicators

| Indicator | Description | Sample Size |
| ----------- | ------------- | ------------- |
| `p1` / Phase 1 | Pilot phase | 100 participants |
| `p2` / Phase 2 | Second phase | 398 participants |
| `p3` / Phase 3 | Third phase | 501 participants |

### Robustness Variant Suffixes

| Suffix | Meaning |
| -------- | --------- |
| `_g1`, `_g15`, `_g2` | Emphasis level variants (stealing emphasis) |
| `_le1`, `_le15`, `_le2` | Less-than emphasis variants |
| `_ksf` | Alternative index construction (Kling-Scharfstein-Feller) |
| `_ss` | Shrunken estimator variant |

---

## Global Macros Reference

Global macros are defined in `dofiles/STYL_longrun_globals.do`. They group variables for use in analysis.

### Treatment and Stratification

#### `$cohorts`

**Purpose:** Treatment cohort indicator variables for randomization blocks.

```stata
redlight_p1 redlight_p2 centralmonrovia_p2 redlight_p3
centralmonrovia_p3 claratown_p3 logantown_p3 newkrutown_p3
```

#### `$strata`

**Purpose:** Stratification fixed effects for regression specifications.

```stata
I.tp_strata_alt I.cg_strata
```

### Baseline Control Variables

#### `$base`

**Purpose:** Full set of baseline control variables (90+ variables) used in main ITT specifications.

**Used in:** Tables 1, 2, A2, A3, B1-B13

**Categories included:**

- Demographics: `age_b`, `livepartner_b`, `mpartners_b`, `hhunder15_b`
- Family/Social: `famseeoften_b`, `muslim_b`
- Education: `school_b`, `schoolbasin_b`, `literacy_b`, `mathscore_b`
- Health: `health_resc_b`, `disabled_b`, `depression_b`, `distress_b`
- War exposure: `rel_commanders_b`, `faction_b`, `warexper_b`
- Economic: `profitsump99avg7d_b`, `wealth_indexstd_b`, `homeless_b`, `slphungry7dx_b`, `savstockp99_b`, `loan50_b`, `loan300_b`
- Employment: `illicit7da_zero_b`, `agricul7da_zero_b`, `nonagwage7da_zero_b`, `allbiz7da_zero_b`, `nonaghigh7da_zero_b`, `agriculeveramt_b`, `nonagbizeveramt_b`, `nonaghigheveramt_b`
- Substance use: `drugssellever_b`, `drinkboozeself_b`, `druggrassself_b`, `grassdailyuser_b`, `harddrugsever_b`, `harddrugsdailyuser_b`
- Antisocial behavior: `steals_b`, `stealnb_nonviol_b`, `stealnb_felony_b`, `cens_disputes_all_b`, `asbhostil_b`
- Personality: `conscientious_b`, `neurotic_b`, `grit_b`, `rewardresp_b`, `locuscontr_b`, `impulsive_b`, `selfesteem_b`
- Time/Risk preferences: `patience_rstd_b`, `incon_game_resc_b`, `risk_game_resc_b`, `timedecl_b`, `riskdecl_b`
- Cognition: `cognitive_score_b`, `ef_score_b`
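
To show how the control and strata macros slot into a regression, here is a hedged sketch on simulated data. Everything in it is hypothetical: the treatment dummy `assigned`, the stratum identifier `block`, the shortened macros `$toy_base` and `$toy_strata`, and the choice of robust standard errors are assumptions for illustration, not the package's actual specification.

```stata
* Toy data: all variable and macro names here are hypothetical
clear
set seed 12345
set obs 400
generate assigned = runiform() < 0.5
generate age_b    = 18 + floor(17 * runiform())
generate school_b = floor(12 * runiform())
generate block    = 1 + mod(_n, 4)
generate fam_asb_tyav = -0.2 * assigned + 0.01 * age_b + rnormal()

* Macros in the spirit of $base and $strata, shortened for illustration
global toy_base   age_b school_b
global toy_strata i.block

* ITT-style specification: outcome on assignment, baseline controls, strata effects
regress fam_asb_tyav assigned $toy_base $toy_strata, vce(robust)
display _b[assigned]
```

With the real data, `$base` and `$strata` would take the place of the toy macros once `dofiles/STYL_longrun_globals.do` has been run.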

#### `$base_balance`

**Purpose:** Controls for balance tables (Table A2). Similar to `$base` but uses `health_resc_z_b` instead of `health_resc_b`.

#### `$base_het`

**Purpose:** Controls for heterogeneity analysis (Table B3). Excludes antisocial behavior baseline variables to avoid conditioning on the heterogeneity variable.

#### `$base_no_asb`

**Purpose:** Controls excluding all antisocial behavior measures. Used when ASB is the dependent variable to avoid over-controlling.

#### `$base_small`, `$base_small2`, `$base_small3`

**Purpose:** Reduced control sets for robustness checks with fewer covariates.

#### `$base_dml`

**Purpose:** Controls selected by Double Machine Learning (Lasso) for robustness specification.

```stata
asbhostil_b stealnb_nonviol_b steals_b riskdecl_b mathscore_b
harddrugsever_b timedecl_b homeless_b mpartners_b druggrassself_b
harddrugsdailyuser_b agricul7da_zero_b impulsive_b cens_disputes_all_b
patient_game_real_b wealth_indexstd_b savstockp99_b school_b
ef_score_b rewardresp_b selfesteem_b agriculeveramt_b cognitive_score_b
```

### Outcome Variable Lists

#### Main Outcome Indices (`$dvs_t1`)

**Purpose:** Primary outcome indices for Table 1.

| Variable | Description |
| ---------- | ------------- |
| `fam_econ` | Economic performance index |
| `fam_asb` | Antisocial behavior index |
| `mech_allexceptef` | Mechanisms index (excluding executive function) |
| `timepref` | Time preferences index |
| `selfcontrolnolo` | Self-control index |
| `fam_identity` | Identity and values index |
| `fam_mental` | Mental health index |
| `subabuse` | Substance abuse index |
| `fam_network` | Social networks index |

#### Antisocial Behavior Components (`$dvs_t2`)

**Purpose:** Components of the antisocial behavior index for Table B.

| Variable | Description |
| ---------- | ------------- |
| `fam_asb` | Overall ASB index |
| `drugssellever` | Drug selling |
| `stealnb_std` | Theft (standardized) |
| `disputes_all_z_std` | Disputes (standardized) |
| `carryweapon_std` | Weapon carrying |
| `arrested_std` | Arrests |
| `asbhostil_std` | Antisocial/hostile behaviors |
| `domabuse_z_std` | Domestic abuse |

#### Economic Components (`$dvs_t3`)

**Purpose:** Components of the economic index.

| Variable | Description |
| ---------- | ------------- |
| `fam_econ` | Overall economic index |
| `cstot2wusd_std` | Consumption (2-week, USD) |
| `profitsump99avg7d_std` | Profits (7-day average, 99th percentile censored) |
| `wealth_indexstd_std` | Wealth index |
| `savstockp99_std` | Savings stock |
| `bizstocktotp99_std` | Business stock |
| `houravg7d_std` | Hours worked |
| `homeless_std` | Homelessness |

#### Time Preferences (`$dvs_t4`)

**Purpose:** Time preference index components.

| Variable | Description |
| ---------- | ------------- |
| `timepref` | Overall time preferences index |
| `timepref_p` | Patience sub-index |
| `timepref_t` | Time consistency sub-index |

#### Self-Control (`$dvs_t5`)

**Purpose:** Self-control index components.

| Variable | Description |
| ---------- | ------------- |
| `selfcontrolnolo` | Overall self-control index |
| `impulsivestd` | Impulsivity |
| `conscientiousstd` | Conscientiousness |
| `gritstd` | Grit |
| `rewardrespstd` | Reward responsiveness |

#### Identity and Values (`$dvs_t6`)

**Purpose:** Identity index components.

| Variable | Description |
| ---------- | ------------- |
| `fam_identity` | Overall identity index |
| `attcriminality_std` | Attitudes toward criminality |
| `attviolence_std` | Attitudes toward violence |
| `politicalviol_std` | Political violence attitudes |
| `appearanceindex_std` | Personal appearance |
| `prosocialindex_std` | Prosocial behavior |

#### Mental Health (`$dvs_t7`, `$dvs_t13`)

**Purpose:** Mental health index components.

| Variable | Description |
| ---------- | ------------- |
| `fam_mental` | Overall mental health index |
| `fam_mental_pos` | Positive mental health |
| `fam_mental_dep` | Depression component |
| `selfesteemstd_std` | Self-esteem |
| `wellbeing_std` | Wellbeing |
| `locuscontrstd_std` | Locus of control |
| `neuroticstd_std` | Neuroticism |
| `distress_std` | Psychological distress |
| `depression_std` | Depression |

#### Substance Abuse (`$dvs_t8`)

**Purpose:** Substance abuse index components.

| Variable | Description |
| ---------- | ------------- |
| `subabuse` | Overall substance abuse index |
| `drinkboozeever_std` | Alcohol use |
| `grassever_std` | Marijuana use |
| `harddrugsever_std` | Hard drug use |

#### Social Networks (`$dvs_t9`)

**Purpose:** Social network index components.

| Variable | Description |
| ---------- | ------------- |
| `fam_network` | Overall network index |
| `peerqualitystd_std` | Peer quality |
| `famsupportindex_std` | Family support |
| `rel_commanders_std` | Relations with commanders |
| `patronlvl_std` | Patron relationships |

### Variables for Specific Tables

#### `$asb_ind` (Table 10)

Full list of antisocial behavior indicators and their subcomponents.

#### `$timepref_ind` (Table 12)

Time preference indicators including game-based and declared measures.

#### `$forwlook_ind` (Table 13)

Forward-looking behavior indicators (impulsivity, conscientiousness, grit, reward responsiveness).

#### `$identity_ind` (Table 14)

Identity indicators including attitudes toward violence, criminality, appearance, and prosocial behavior.

#### `$econ_ind` (Table 11)

Economic indicators including profits, wealth, consumption, savings, business stock, hours, and homelessness.

---

## Data Transformations

### Currency Conversion

All monetary values are converted to US dollars using:

```
Exchange Rate: 70 Liberian Dollars (LD) = 1 US Dollar (USD)
```

This rate is applied in `dofiles/construction/STYL_longrun_cons_bl.do` (line 48).

**Note:** The source date for this exchange rate should be verified. Historical exchange rates for Liberia have varied significantly.
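
As a worked example of the conversion, the sketch below divides a Liberian-dollar amount by 70. The input variable `income_ld_b` is a hypothetical name; the actual construction code may organize this step differently.

```stata
* Toy example: income_ld_b is a hypothetical variable in Liberian dollars
clear
set obs 3
generate income_ld_b = 700 * _n

* 70 LD = 1 USD, so divide by 70
local ld_per_usd = 70
generate incomeusd_b = income_ld_b / `ld_per_usd'
list
```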

### Outlier Censoring

Variables are censored at the 99th percentile to reduce the influence of extreme outliers:

```stata
* Censor at the 99th percentile, computed among non-zero values
generate variable_p99 = variable
summarize variable if variable != 0, detail
replace variable_p99 = r(p99) if variable > r(p99) & !missing(variable)
```

**Censoring thresholds differ by survey round:**

- Rounds 1-6: Censored at the round-specific 99th percentile
- Rounds 7-8: Censored at the 99th percentile of the 10-year survey
- Rationale: Round-specific thresholds keep the short-term results consistent with previous papers
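
The round-dependent thresholds can be sketched as a loop over censoring groups. This is an illustration on simulated data, not the package code: the variable `profit`, the helper `censgrp`, and the assumption that rounds 7 and 8 share one pooled threshold are all hypothetical.

```stata
* Toy panel: "profit" and the round layout are hypothetical
clear
set seed 2024
set obs 800
generate round  = 1 + mod(_n, 8)
generate profit = exp(rnormal())

* Censoring group: each of rounds 1-6 on its own, rounds 7-8 pooled (assumption)
generate censgrp = cond(round >= 7, 7, round)

* Top-code at the 99th percentile within each censoring group
generate profit_p99 = profit
levelsof censgrp, local(groups)
foreach g of local groups {
    summarize profit if censgrp == `g', detail
    replace profit_p99 = r(p99) if censgrp == `g' & profit > r(p99) & !missing(profit)
}
```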

### Missing Data Handling

1. **Median imputation within groups:** For index construction, missing values are imputed with the median within phase groups.

2. **Zero imputation:** For some economic variables, a missing value is set to 0 when the respondent did not participate in the activity (e.g., `cashsum4w_e` is set to 0 if it is missing and `employed == 0`).
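
Both rules can be sketched in a few lines of Stata. The toy data, the grouping variable `phase`, and the names `component`, `component_med`, and `component_imp` are hypothetical; the package's own programs may implement the imputation differently.

```stata
* Toy data: phase, component, and the helper names are hypothetical
clear
input phase component cashsum4w_e employed
1 2  5 1
1 4  . 0
1 .  . 1
2 10 8 1
2 .  . 0
2 30 3 1
end

* Rule 1: fill missing values with the median of the non-missing values in the phase
bysort phase: egen component_med = median(component)
generate byte component_imp = missing(component)
replace component = component_med if missing(component)

* Rule 2: missing earnings become 0 only for respondents who were not employed
replace cashsum4w_e = 0 if missing(cashsum4w_e) & employed == 0
list, sepby(phase)
```

Keeping a flag such as `component_imp` makes it easy to check later how much of an index rests on imputed values.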

### Measurement Period Harmonization

Different survey phases used different reference periods. Harmonization:

```stata
* Convert a 4-week total to a 2-week equivalent where the 2-week value is missing
replace var_e = var4w_e / 2 if missing(var_e)

* Convert a 4-week total to a 7-day equivalent
replace var7d_b = var4w_b / 4
```

---

## Index Construction Methodology

Indices are constructed using the `index_maker` program in `dofiles/functions/indexmaker.do`.

### Standard Index Construction

1. **Standardization:** Each component is standardized using the control group's mean and standard deviation, so that the control group has mean=0, sd=1
2. **Sign alignment:** Components where higher = worse are rescaled (multiplied by -1)
3. **Aggregation:** Components are averaged (or summed)
4. **Recentering:** Final index is standardized to mean=0, sd=1

```stata
index_maker, indexname(fam_asb) indexlab("Antisocial Behavior Index") ///
    inputvars(drugssellever stealnb disputes_all carryweapon arrested asbhostil domabuse) ///
    suffix(tyav) operation(mean) samplestd(control) sampleval(1) recenter
```
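
For readers who want to see the four steps without the helper program, the sketch below reproduces them in plain Stata on simulated data. It is not the `index_maker` source: the variable names are hypothetical, and recentering against the control group in the last step is an assumption.

```stata
* Toy data: treat, comp_good, comp_bad are hypothetical names
clear
set seed 42
set obs 500
generate treat     = runiform() < 0.5
generate comp_good = rnormal() + 0.2 * treat
generate comp_bad  = rnormal() - 0.2 * treat

* Steps 1-2: standardize against the control group, then flip "higher = worse"
foreach v in comp_good comp_bad {
    summarize `v' if treat == 0
    generate `v'_z = (`v' - r(mean)) / r(sd)
}
replace comp_bad_z = -1 * comp_bad_z

* Step 3: average the aligned components
egen index_raw = rowmean(comp_good_z comp_bad_z)

* Step 4: recenter so the control group has mean 0 and sd 1 (assumption)
summarize index_raw if treat == 0
generate index = (index_raw - r(mean)) / r(sd)
summarize index if treat == 0
```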

### Covariance-Weighted Index (GLS)

Alternative weighting using inverse covariance matrix (Kling-Scharfstein-Feller approach):

1. Calculate covariance matrix of standardized components
2. Invert the covariance matrix
3. Weight each component by its row sum from the inverted matrix
4. Sum weighted components and normalize

See `index_cov_maker` in `dofiles/functions/indexmaker.do`.
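
The four steps can be sketched with Stata's matrix commands. This is an illustration under stated assumptions, not the `index_cov_maker` source: the component names are hypothetical, and dividing by the total weight and standardizing over the full sample are choices made here for simplicity.

```stata
* Toy data: x1-x3 are hypothetical components; x1 and x2 are correlated
clear
set seed 7
set obs 500
generate common = rnormal()
generate x1 = common + rnormal()
generate x2 = common + rnormal()
generate x3 = rnormal()
foreach v in x1 x2 x3 {
    egen z_`v' = std(`v')
}

* Steps 1-2: covariance matrix of the standardized components and its inverse
quietly correlate z_x1 z_x2 z_x3, covariance
matrix S    = r(C)
matrix Sinv = invsym(S)

* Step 3: weight for each component = row sum of the inverse matrix
matrix W = Sinv * J(3, 1, 1)

* Step 4: weighted sum scaled by the total weight, then standardized (assumption)
generate index_cov_raw = (W[1,1]*z_x1 + W[2,1]*z_x2 + W[3,1]*z_x3) / (W[1,1] + W[2,1] + W[3,1])
egen index_cov = std(index_cov_raw)
matrix list W
```

In this toy setup the two correlated components receive smaller weights than the independent one, which is the point of the weighting: components that carry overlapping information count for less.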

### Wave Averaging

The `group_mean` program creates averages across survey rounds within waves:

```stata
group_mean, inputvars(outcome1 outcome2) suffixout(tyav) suffixin(e) ///
    id(partid) groupvar(round) groups(7 8)
```

This creates `outcome1_tyav` and `outcome2_tyav`, each participant's average of `outcome1_e` and `outcome2_e`, respectively, across rounds 7 and 8.
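
A rough plain-Stata equivalent for one outcome is shown below on a toy long-format panel. It is a sketch rather than the `group_mean` source: how the program treats participants observed in only one of the two rounds, and whether it fills the average on rows from other rounds, is assumed here.

```stata
* Toy long-format panel: one row per participant and round
clear
input partid round outcome1_e
1 6 100
1 7 2
1 8 4
2 7 10
2 8 .
end

* Average over rounds 7 and 8 only; other rounds are mapped to missing,
* which egen's mean() ignores
bysort partid: egen outcome1_tyav = mean(cond(inlist(round, 7, 8), outcome1_e, .))
list, sepby(partid)
```

Participant 1 gets the average of 2 and 4; participant 2, with round 8 missing, gets the single available value.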
